“TrendMicro-Chang-MC1”

VAST 2012 Challenge
Mini-Challenge 1: Bank of Money Enterprise: Cyber Situation Awareness

Team Members:

Hanwen Chang, Trend Micro, hanwen_chang@trend.com.tw PRIMARY

Junyu Chen, Trend Micro, junyu_chen@trend.com.tw

Jeff C Huang, Trend Micro, jeff_c_huang@trend.com.tw

Beti Chiang, Trend Micro, beti_chiang@trend.com.tw

Mear Kuo, Trend Micro, mear_kuo@trend.com.tw

Student Team: No

Tool(s):

Microsoft Excel

Notepad++

Tableau

Video:

VAST Challenge 2012

Answers to Mini-Challenge 1 Questions:

MC 1.1 Create a visualization of the health and policy status of the entire Bank of Money enterprise as of 2 pm BMT (BankWorld Mean Time) on February 2. What areas of concern do you observe?

To monitor the current status of each machine, we designed an interactive widget that displays machine types, machine activity, and policy status on a map and highlights machines with abnormal activity or policy deviation.

Because the regions of Bank of Money are not segmented by the physical boundaries of countries, we designed the background to show the boundaries of each region instead, so that any region-scale anomaly is clearly visible. We also use light and dark shading to distinguish the time zones in business hours from those in non-business hours. For the data points, we use shape to distinguish machine types, color hue to represent activity, and point size to reflect the severity of policy deviation. As a result, administrators can quickly identify from the map the machines with more severe policy deviations, as well as the ATMs and servers with potential illegal login attempts or denial-of-service attacks.

From the health and policy status as of 2pm BMT on February 2, we observed two general findings.

First, although staff are encouraged to turn off workstations at night, we found that most workstations were not turned off during non-business hours.

Switching to the tabular view, it is clear that only 67 of 40178 (0.17%) workstations were offline. Fortunately, none of them showed consecutive login failures, full CPU utilization, or an attached external device.
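The offline share quoted above can be checked with a short calculation. This is only a sketch of the arithmetic; the variable names are our own and the two counts are the ones reported in the tabular view.

```python
# Counts taken from the tabular view described above.
offline_workstations = 67
total_workstations = 40178

# Share of workstations that were offline, as a percentage.
offline_share = offline_workstations / total_workstations * 100

print(f"{offline_share:.2f}% of workstations were offline")
```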

Second, we observed that none of the machines in Regions 5 and 10 were in a healthy state; every machine in these two regions had at least a moderate policy deviation. Compared with other large regions, this was not normal. Potential causes include a regional policy update or a deployment issue, which should be confirmed with the regional IT administrators.

MC 1.2 Use your visualization tools to look at how the network’s status changes over time. Highlight up to five potential anomalies in the network and provide a visualization of each. When did each anomaly begin and end? What might be an explanation of each anomaly?

For trend analysis, we started with the average number of connections over time and looked for machines with unusual peaks in connections during the two days. We compared the average number of connections between regions at the same time of day and visualized the data with a line chart.

The x-axis represents time and the y-axis represents the number of connections; colors indicate regions. From the line chart, we observed that almost all regions had a similar average number of connections at the same time of day. However, workstations in Region 10 had noticeably more connections than those in all other regions from 2 AM to 5 AM on Feb 3 (local time). While workstations in other regions averaged 5 connections during non-business hours, workstations in Region 10 averaged 15 connections during these three non-business hours. To investigate further, we compared the branches within Region 10.

The drill-down line chart showed that all branches had above-average connection counts during these three hours, so the phenomenon was not specific to any single branch.
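The regional comparison above was done visually in Tableau, but the same idea can be sketched in code. The following is a minimal illustration, not our actual workflow: the record layout `(region, hour, connections)`, the function names, the `ratio` threshold, and the toy data are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean, median

def mean_connections(records):
    """Average connections per (region, hour).

    `records` is an iterable of (region, hour, connections) tuples;
    this layout is hypothetical, not the challenge data schema.
    """
    buckets = defaultdict(list)
    for region, hour, connections in records:
        buckets[(region, hour)].append(connections)
    return {key: mean(values) for key, values in buckets.items()}

def flag_outliers(means, ratio=2.0):
    """Return (region, hour) pairs whose mean exceeds `ratio` times
    the median of all regions' means for that hour."""
    by_hour = defaultdict(dict)
    for (region, hour), value in means.items():
        by_hour[hour][region] = value
    flagged = []
    for hour, per_region in by_hour.items():
        baseline = median(per_region.values())
        for region, value in per_region.items():
            if baseline > 0 and value > ratio * baseline:
                flagged.append((region, hour))
    return sorted(flagged)

# Toy data only: two ordinary regions and one with elevated counts at hour 3.
records = [
    (1, 3, 5), (1, 3, 5),
    (2, 3, 4), (2, 3, 6),
    (10, 3, 15), (10, 3, 15),
]
means = mean_connections(records)
flagged = flag_outliers(means)
print(flagged)
```

Using the median across regions as the baseline keeps a single unusual region from shifting the reference value it is compared against.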

Second, we reused the approach from MC 1.1 to generate an overview of how policy status and activity flags changed over time. By showing the data for each time frame as a series, we could follow the changes across the two given days. Starting at 2 PM on Feb 2, the data was sampled every 6 hours. The maps showed that the overall policy status steadily worsened.

At 2 PM on Feb 2, only one server in Region 2 had a possible virus infection.

However, 6 hours later, 16 regions, which is one-third of all regions, had possible virus infections.

At 8 AM on Feb 3, all but 3 regions had possible virus infections.

Finally, at 8 AM on Feb 4, every region had more than 20 machines with possible virus infections.
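The six-hour sampling described above can be enumerated as follows. This is a small sketch; the function name is our own, and the calendar year 2012 and the use of naive timestamps are assumptions made for illustration.

```python
from datetime import datetime, timedelta

def sample_times(start, end, step_hours=6):
    """List the snapshot times from `start` to `end` (inclusive)."""
    times = []
    current = start
    while current <= end:
        times.append(current)
        current += timedelta(hours=step_hours)
    return times

# Assumed year: 2012. Start and end match the first and last snapshots above.
start = datetime(2012, 2, 2, 14, 0)
end = datetime(2012, 2, 4, 8, 0)
snapshots = sample_times(start, end)

for moment in snapshots:
    print(moment.strftime("%b %d %H:%M"))
```

The snapshots discussed in the text (2 PM and 8 PM on Feb 2, 8 AM on Feb 3, and 8 AM on Feb 4) all fall on this six-hour grid.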

The degradation of policy status could also be seen in the line charts.

The share of healthy machines dropped from 90% to 42%, meaning that roughly half of all machines became unhealthy. Machines with moderate policy deviation increased from 10% to 40%. In addition, 18% of all machines exhibited serious or critical policy deviations or had possible virus infections.
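The percentages above are shares of machines per policy status within one snapshot. A minimal sketch of that calculation follows; the status labels, the function name, and the 100-machine toy snapshot are assumptions, chosen so that the output matches the final percentages reported above.

```python
from collections import Counter

def status_shares(statuses):
    """Percentage of machines in each policy status for one snapshot."""
    counts = Counter(statuses)
    total = len(statuses)
    return {status: 100 * n / total for status, n in sorted(counts.items())}

# Toy snapshot of 100 machines with hypothetical status labels.
snapshot = ["healthy"] * 42 + ["moderate"] * 40 + ["serious_or_worse"] * 18
shares = status_shares(snapshot)
print(shares)
```

Computing these shares for every sampled time frame gives the series that the line charts plot.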